统计研究 ›› 2021, Vol. 38 ›› Issue (5): 109-120. doi: 10.19343/j.cnki.11-1302/c.2021.05.009


稀疏非线性函数型可加模型的变量选择

白永昕 田茂再   

  • 出版日期:2021-05-25 发布日期:2021-05-25

Variable Selection for Sparse Nonlinear Functional Additive Model

Bai Yongxin, Tian Maozai

  • Online:2021-05-25 Published:2021-05-25

摘要: 本文研究了响应变量和协变量均为函数型数据的非线性可加模型的变量选择问题。首先,基于函数型距离相关系数,本文构造了一个 F 检验统计量,对协变量和残差的函数型距离相关系数进行排序并对最大相关系数所对应的协变量与残差进行独立性 F 检验,选择满足条件的新变量纳入到模型。其次,对每个新变量纳入模型后的贡献进行评估,从而确认新变量最终是否应该纳入模型。这种变量选择方法通过不依赖模型的方法选择候选变量,将变量选择和模型估计分开,可以降低回归中协变量的维度。同时,在迭代过程中利用残差可以获取模型的相关信息,从而提高变量选择的准确度。最后,本文通过模拟研究对所提变量选择方法的表现进行评价,并进一步通过一个家电能耗数据来验证所提的方法。

关键词: 函数型响应变量, 非线性可加模型, 变量选择

Abstract: This paper considers variable selection for nonlinear additive models in which both the response and the covariates are functional data. First, we construct an F test statistic based on the functional distance correlation: the functional distance correlations between the covariates and the residuals are ranked, an F test of independence is carried out between the residuals and the covariate with the largest correlation, and a covariate that passes the test enters the model as a new variable. Second, the contribution of each newly included variable is evaluated to decide whether it should finally remain in the model. Because candidates are chosen with a model-free measure, the procedure separates variable selection from model estimation, which reduces the dimension of the covariates in the regression. In addition, using the residuals in each iteration brings information about the current model into the selection step, which improves selection accuracy. Finally, the performance of the proposed procedure is assessed in Monte Carlo simulation studies and further illustrated on a dataset of household appliance energy consumption.

Key words: Functional Response Variables, Nonlinear Additive Model, Variable Selection
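The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes curves observed on a common, equally spaced grid, approximates the L2 distance between curves by a Riemann sum, and uses the standard sample distance correlation built from double-centered distance matrices. The function names (`distance_correlation`, `rank_candidates`, `make_toy_data`) and the toy data are hypothetical. The F test of independence and the contribution check are left out because the abstract does not give their formulas.

```python
import math
import random


def l2_distance_matrix(curves, dt):
    """Pairwise L2 distances between curves on a common grid (Riemann-sum approximation)."""
    n = len(curves)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq = sum((a - b) ** 2 for a, b in zip(curves[i], curves[j])) * dt
            d[i][j] = d[j][i] = math.sqrt(sq)
    return d


def double_center(d):
    """Subtract row and column means and add back the grand mean (d is symmetric)."""
    n = len(d)
    row = [sum(r) / n for r in d]
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]


def distance_correlation(x_curves, y_curves, dt):
    """Sample distance correlation between two sets of paired curves."""
    a = double_center(l2_distance_matrix(x_curves, dt))
    b = double_center(l2_distance_matrix(y_curves, dt))
    n = len(a)

    def mean_prod(u, v):
        return sum(u[i][j] * v[i][j] for i in range(n) for j in range(n)) / (n * n)

    denom = math.sqrt(mean_prod(a, a) * mean_prod(b, b))
    if denom <= 0.0:
        return 0.0  # at least one set of curves is constant across samples
    return math.sqrt(max(mean_prod(a, b), 0.0) / denom)


def rank_candidates(covariates, residual_curves, dt):
    """Sort covariate names by distance correlation with the residuals, largest first."""
    scores = [(name, distance_correlation(curves, residual_curves, dt))
              for name, curves in covariates.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def make_toy_data(seed=0, n=30, m=20):
    """Toy example (an assumption): residuals depend nonlinearly on x1 and not on x2."""
    rng = random.Random(seed)
    grid = [k / (m - 1) for k in range(m)]
    dt = 1.0 / (m - 1)
    a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x1 = [[ai * math.sin(2 * math.pi * t) for t in grid] for ai in a]
    x2 = [[bi * math.cos(2 * math.pi * t) for t in grid] for bi in b]
    resid = [[ai ** 3 * t + rng.gauss(0.0, 0.05) for t in grid] for ai in a]
    return {"x1": x1, "x2": x2}, resid, dt


if __name__ == "__main__":
    covariates, residuals, dt = make_toy_data()
    for name, score in rank_candidates(covariates, residuals, dt):
        print(name, round(score, 3))
```

In an iterative procedure of the kind the abstract outlines, the top-ranked covariate would then be passed to the independence test, the model refitted, and the residuals recomputed before the next ranking. The sample distance correlation is biased upward in small samples, so the raw scores of unrelated covariates are not expected to be near zero.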